A UTILITY DEVIATION IN DISCOUNTED MARKOV DECISION PROCESSES WITH GENERAL UTILITY
Related resources
Discounted Markov decision processes with utility constraints
We consider utility-constrained Markov decision processes. The expected utility of the total discounted reward is maximized subject to multiple expected utility constraints. By introducing a corresponding Lagrange function, a saddle-point theorem of the utility constrained optimization is derived. The existence of a constrained optimal policy is characterized by optimal action sets specified w...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Satiation in Discounted Utility
In this paper, we propose a model of intertemporal choice that explicitly incorporates satiation due to previous consumption in the evaluation of the utility of current consumption. In the discounted utility (DU) model, the utility of consumption is evaluated afresh in each time period. In our model, the utility of current consumption represents an incremental utility from the past level. When ...
Continuous Time Markov Decision Processes with Expected Discounted Total Rewards
Abstract. This paper discusses continuous time Markov decision processes with criterion of expected discounted total rewards, where the state space is countable, the reward rate function is extended real-valued and the discount rate is a real number. Under necessary conditions that the model is well defined, the state space is partitioned into three subsets, on which the optimal value function ...
Markov Decision Processes with General Discount Functions
In Markov Decision Processes, the discount function determines how much the reward for each point in time adds to the value of the process, and thus deeply affects the optimal policy. Two cases of discount functions are well known and analyzed. The first is no discounting at all, which corresponds to the total- and average-reward criteria. The second case is a constant discount rate, which leads to a...
Journal
Journal title: Bulletin of informatics and cybernetics
Year: 1996
ISSN: 0286-522X
DOI: 10.5109/13455